How to deal with natural language interfaces in a 
geolocation context? 
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In the geolocation field where high-level programs and low-level 
devices coexist, it is often difficult to find a friendly user inter- 
face to configure all the parameters. The challenge addressed 



^ in this paper is to propose intuitive and simple, thus natural lan- 

i guage interfaces to interact with low-level devices. Such inter- 

faces contain natural language processing and fuzzy represen- 
tations of words that facilitate the elicitation of business-level 
objectives in our context. 
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1 INTRODUCTION 

The question of geolocation is the core base of many companies working on 
geomarketing. Social networks with recommendations (nearby social events, 



X 

C^ nearby restaurants, etc.), determination of a customer profile depending on 

his choices, preferences, tastes, etc. are examples of growth sectors, espe- 
cially with the expansion of the mobile market that is rapidly growing. The 
company we are working with, is a leader in geolocation middleware and pro- 
poses systems for child location, vehicle and fleet tracking or delivery rounds. 
From the company point of view, persons, vehicles, goods are considered as 
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devices that have to be tracked with accuracy using a Geohub and notifica- 
tions must be sent according to type of required tracking. On the contrary, 
from the customer point of view, a service has to be proposed, according to 
a prior agreement but no (or few) technical detail(s) has (have) to be known. 
The Geohub coordinates geoinformation on a single platform to track devices 
and to give them orders knowing their position (e.g. what to do in case of traf- 
fic jam, or accident on the road, or if the vehicle has broken down, etc.). The 
Geohub interacts with all the tracked devices. Low-level messages (written 
in the Forth language) are sent by both the devices and the Geohub in "push" 
or "pull" mode. Two roles have to be distinguished: (1) the company who 
sells and connects devices to its hub and who configures the hub to satisfy 
the customer and (2) the customer who expresses his need (his fleet track- 
ing, for example) to an employee of the company that is able to configure the 
hub. Thus the employee has to understand the customer needs and elicit his 
preferences while configuring the Geohub. 

An important challenge for the company is to propose a smart interface, 
smart enough to let the customers configure the Geohub by themselves. The 
customers would have a phone conversation with a virtual assistant. We know 
that natural language processing (NLP) deals with NP-complete problems 
that is why such an interaction needs many contextual elements to limit the 
possibilities. However, even with context, the problem remains quite hard 
since it implies an important expertise to be able to translate the needs ex- 
pressed in a natural language into a set of Forth scripts and programs written 
in other languages. Thus there is a need for tools to capture these sometimes 
imprecise requirements (e.g. "I would like to be notified when one of my 
fleet vehicles arrives near the warehouse") and transform them into business 
processes. The fuzzy logic and its various methods and tools are of great help 
for such needs and especially the methods that deal with linguistic variables. 

The paper is organized as follows: we first review some NLP techniques 
and we focus on an interesting linguistic fuzzy model to deal with impre- 
cisions. Our approach that mixes NLP with fuzzy tools in the geolocation 
context is then explained. Finally, a use case is presented: it exhibits the use 
of a fuzzy semantic approach to activate an alert in a geolocation context. 

2 STATE OF THE ART 

Since the first techniques to deal with natural language, many methods and 
algorithms have been proposed to understand and disambiguate sentences [5 , 
[TD] [13] [12] ■ For example, decision trees and statistical models may give good 



results as soon as there is context enough in the speech corpus. In NLP, a 
formal grammar draws a set of normative rules to describe how the natu- 
ral language runs. Among the various formal grammars are the generative 
grammars, the transformational grammars and the tree-adjoining grammar 
(TAG) [9]. To make discourse analysis, part-of-speech tagging is often used 
to disambiguate words (e.g. "display" can be a noun, a transitive verb or an 
intransitive verb) [6|. Syntax is part of formal grammar that deals with rules 
for the structure of a string (distribution of words, noun and adjective agree- 
ments, etc.). Meaning and formal syntax are the basis of the comprehension, 
according to Chomsky [4|. Semantics is the study of the meaning of words 
or sentences in their context. The concept of meaning is often fuzzy because 
natural language may be imprecise if we consider that each word (or set of 
words) has more than one meanings (without taking into account the figura- 
tive sense that is something else again). In linguistics, the meaning of a vague 
or imprecise expression is a fuzzy set in the proper sense. What we call fuzzy 
semantics is an approach that uses fuzzy logic (Zadeh's sense) to express the 
fuzziness of the meaning. 

Zadeh was the pioneer researcher to propose approaches to deal with im- 
precision and uncertainty when he introduced the fuzzy set theory, the fuzzy 
logic and the concept of linguistic variables [14|. Since, many fuzzy mod- 
els have been proposed for computing with words but one seems the most 
appropriate in our case because it makes a simple correspondence between 
words and fuzzy scales: the 2-tuple fuzzy linguistic model |8). Each word 
or linguistic expression is translated into a linguistic pair (s,, a) where Sj is 
a triangular-shaped fuzzy set and a a symbolic translation. If a is positive 
then Si is reinforced else Si is weakened. If the information is perfectly bal- 
anced (i.e. the distance between each consecutive word is exactly the same), 
then all the s, values are equally distributed on the axis. But if not (which 
may happen when talking about distance, for instance, "almost arrived" and 
"close to" are closer to each other than "far" and "out of route") the Si val- 
ues may not be equally distributed on the axis. Another model has thus been 
proposed to deal with such information called multigranular linguistic infor- 
mation [7]. To perform the computations, linguistic hierarchies composed of 
an odd number of triangular fuzzy sets of the same shape, equally distributed 
on the axis, are used as a fuzzy partition. A word may have one or two lin- 
guistic hierarchy(ies) and the representation of a notion (such as "distance") 
can be made up of several hierarchies. Thus the fuzzy sets obtained are still 
triangular-shaped but not always isosceles triangular-shaped and the notation 
si permits to keep both the hierarchy and the linguistic term. See ifTTI for a 



deeper review of these models. 

Nevertheless, we have shown in recent papers that the 2-tuple model with 
unbalanced linguistic term sets don't allow to model any unbalanced scale. 
Indeed, when one linguistic expression is very far away from its next neigh- 
bor, the 2-tuple fuzzy linguistic model puts them artificially closer to ea- 
chother J2] [3). The model we have proposed that is of course inspired by 
the 2-tuple fuzzy linguistic model uses the symbolic translations a to gener- 
ate the data set. In our model, the 2-tuples are twofold. Except the first one 
and the last one of the partition, they all are composed of two half 2-tuples: 
an upside and a downside 2-tuple. In our context (geolocation) where the 
linguistic terms are usually unbalanced, this choice is quite relevant. 

3 HOW A LINGUISTIC 2-TUPLE MODEL CAN HELP NLP? 

Parts of Speech (PoS) recognition and the PoS tagging have been used to 
analyze the strings. The analysis is simplified using a semantic tagging be- 
cause the context (geolocation software) is known. As an example, let us 
take the following need: "I want to receive an alert when the vehicle 
gets very close to the warehouse". The tagging permits to obtain tokens 
with a certain grammar and meaning. "I" is grammatically a pronoun and 
has no particular meaning for us (it simply designates the customer), "alert" 
is grammatically a noun and has meaning ALERT, "gets" is grammatically 
a verb and has meaning ZONE_ENTRY. "very" is grammatically an adverb 
and has meaning FUZZY_MODIF_+. "close to" is grammatically an adjec- 
tive and has meaning DISTANCE. 

A tree using a simplified tree-adjoining grammar (TAG)-based is then cre- 
ated, where the root is an ALERT. Each leaf node represents the semantic 
tag of a token from the lexicon. In the example below POI means "point of 
interest" and ZOI "zone of interest". 

ALERT=TYPE, MOBILE, PLACE, NOTIFICATION 
TYPE=ZONE_ENTRY | ZONE_EXIT | CORRIDOR 

PLACE=TOWN | ADDRESS | POI | ZOI 

For each concept whose meaning may be imprecise or fuzzy, a fuzzy parti- 
tion with our 2-tuple model is performed by experts. As an example, if the ex- 
pert gives five labels to represent the concept DISTANCE, they are by default 
considered as uniformly distributed on their axis. Looking for synonyms (e.g. 



http : //www . crisco . unicaen . f r/cgi-bin/cherches . cgi that 
exhibits a French dictionary) gives a set of five synonym bags, one bag per la- 
bel. The distance between each label is computed using the number of shared 
synonyms in each bag. Resemblance rate matrices are then computed to de- 
termine both the order of the terms and the distance between them. Finally 
it is easy to construct the partition of the five unbalanced terms thanks to our 
2-tuple linguistic model. See Figure [T] where both partitions (top, Herrera & 
Martinez's one and down, ours) are shown. 
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FIGURE 1 

Unbalanced linguistic term sets: example of our partition model for distance. 



At the top of the figure, the partition seems perfectly balanced because 
the model forces to have a midterm (here, "CloseTo"). The rest of the terms 
are automatically placed from left to right. At the bottom of the figure, our 
partition is unbalanced because we used our method with semantics and syn- 
onyms. Of course the big difference between both models is that in our case 
fuzzy subset "Far" meets "OutOfRoute" with a membership degree lesser 
than .5 (.15 actually). But our model guarantees the minimal coverage prop- 



erty, as proved in [1], that is enough in the computations. 

The semantic tokens (e.g. "close to") are then expressed through our 
2-tuples and compared to the fuzzy partition. Adverbs such as "very" will 
modify their associated 2-tuple through the a translation value, that has to 
be seen as a modifier. We have implemented on Android platform a vo- 
cal interface application built with PhoneGap, an open-source development 
framework that uses the phone's vocal recognition API. 

In the following we propose a use case using both fuzzy models: Herrera 
& Martinez' one and ours. 

4 USE CASE AND COMPARISON OF THE FUZZY MODELS 

As a use case we keep the same example than before with an alert that is 
triggered depending on (1) the distance between the mobile and the arrival (a 
warehouse) that is subject to (2) a time tolerance and depending on (3) the 
battery level. If the time tolerance is high enough, the alert is not triggered 
when the mobile is quite far away from the arrival and the battery level is low 
(to save battery). The alert is triggered, even if the battery level is low, when 
the mobile drives out-of-route of the arrival. The software has been written 
in Java, using j Fuzzy Logic with the extension we have proposed [3] and the 
FCL (Fuzzy Control Language) specification (IEC 61131-7). The FCL script 
below uses three inputs and one output. 

FUNCTION_BLOCK Alertl 

VAR_INPUT Battery: LING; Distance : LING; TimeTolerance : LING; 

END_VAR 

VAR_OUTPUT 

AlertTrigger : REAL; 
END_VAR 
FUZZIFY Battery 

TERM S := pairs (Minimum, 0.0) (VeryLow, 10.0) (Low, 20.0) 

(Medium, 50.0) (High, 60.0) (VeryHigh, 80.0) 
(Maximum, 100.0) ; 
END_FUZZIFY 
FUZZIFY Distance 

TERM S := pairs ( InTheCenter, . 0) (VeryCloseTo, 200 . ) (Near, 400.0) 
(Far, 700.0) (OutOf Route, 1200 . 0) ; 
END_FUZZIFY 

FUZZIFY TimeTolerance 

TERM S := pairs (Minimum, 0.0) (Medium, 60.0) (Maximum, 120.0); 
END_FUZZIFY 

DEFUZZIFY AlertTrigger 

TERM NoAlert := trian 0.0 0.0 1.0; 
TERM Alert := trian 0.0 1.0 1.0; 
METHOD : COG; // 'Center Of Gravity' 



END_DEFUZZIFY 

RULEBLOCK Rules 

RULE 1: IF Battery IS Minimum AND Distance IS InTheCenter AND 

TimeTolerance IS Minimum THEN AlertTrigger IS NoAlert; 

RULE 28: IF Battery IS Maximum AND Distance IS Far AND 

TimeTolerance IS Minimum THEN AlertTrigger IS Alert; 

RULE 63: IF Battery IS Maximum AND Distance IS Far AND 

TimeTolerance IS Medium THEN AlertTrigger IS Alert; 

END_RULEBLOCK 

END FUNCTION BLOCK 



Two different partitions were used for the inputs: the 2-tuple partition from 
Herrera et al. and our 2-tuple partition (in the FCL extract, only our partition 
is shown through the pairs). A study and a comparison between both mod- 
els show that no alert is triggered with Herrera et al.'s model when (i) distance 
is between Near and Far (=600) and battery level is Maximum (=100); (ii) 
distance is Far (=700) and battery level is Maximum while an alert would 
have been triggered with our 2-tuple model. This is due to the fact that the 
unbalancement is weaker in this model than in ours. 



5 CONCLUSION 

This paper is a first approach to take into account natural language in a ge- 
olocation interface where an implementation has been proposed on Android 
platform. The imprecision, i.e. the fuzzy semantics of the sentences are dealt 
with a model inspired by the 2-tuple fuzzy linguistic representation model 
from Herrera & Martinez where unbalancement is the key concept. The par- 
tition is constructed while taking into account the semantics of the concepts 
(synonyms, resemblance rate matrices, etc.). In future works, we will for- 
malize this method that is empirical so far. We also want to introduce more 
contextual elements and ontologies to be able to understand more words. 
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